completed projects that define novel dynamically integrated systems for image understanding.

RESEARCH PROJECTS

INFORMATION  FUSION  AND  HIERARCHICAL  KNOWLEDGE  DISCOVERY  BY 
ARTMAP  NEURAL  NETWORKS  (Siegfried  Martens,  Ogi  Ogas,  Santiago  Olivera,  Gail 
Carpenter) 


Image  fusion  has  been  defined  as  “the  acquisition,  processing  and  synergistic  combination  of 
information provided by various sensors or by the same sensor in many measuring contexts”
(Simone et al., 2002, p. 3). When multiple sources provide inconsistent data, such methods are
called  upon  to  select  the  accurate  information  components.  As  quoted  by  the  International 
Society  of  Information  Fusion:  “Evaluating  the  reliability  of  different  information  sources  is 
crucial  when  the  received  data  reveal  some  inconsistencies  and  we  have  to  choose  among  various 
options.”  For  example,  independent  sources  might  label  an  object  beach  or  road  or  river.  A 
fusion  method  could  address  this  problem  by  weighing  the  confidence  and  reliability  of  each 
source,  merging  complementary  information,  or  gathering  more  data.  In  any  case,  at  most  one  of 
these  answers  is  correct. 

This  project  has  introduced  a  novel  approach  to  the  information  fusion  problem,  with  new 
methods  that  derive  consistent  knowledge  from  sources  that  are  paradoxically  both  inconsistent 
and  accurate.  This  is  a  problem  that  the  human  brain  solves  well.  A  young  child  who  hears  the 
family  pet  variously  called  Spot,  puppy,  dog,  dalmatian,  mammal,  and  animal  is  not  only  not 
alarmed  by  these  conflicting  labels  but  readily  uses  them  to  infer  functional  relationships  that  are 
almost  never  explicitly  specified.  An  analogous  information  fusion  problem  seeks  to  classify  the 
terrain  and  objects  in  an  unfamiliar  territory  based  on  intelligence  supplied  by  several  reliable 
sources.  Each  source  labels  a  portion  of  the  region  based  on  sensor  data  and  observations 
collected  at  specific  times,  and  based  on  individual  goals  and  interests.  Different  sources  might 
label  a  given  pixel  beach,  plage,  open  space,  and  natural.  A  human  mapping  analyst  would,  in 
this  case,  be  able  to  apply  a  lifetime  of  experience  to  resolve  the  paradox  by  organizing  objects  in 
a  knowledge  hierarchy,  and  a  rule-based  expert  system  could  be  constructed  to  codify  this 
knowledge.  Alternatively,  an  analyst  might  be  faced  with  complex  or  unfamiliar  labels,  and  the 
structure  of  label  relationships  might  vary  from  one  test  region  to  the  next. 

An  ARTMAP  neural  network  can  derive  hierarchical  knowledge  structures  from  nominally 
inconsistent  training  data  (Carpenter,  Martens,  &  Ogas,  2005).  The  system  learns  that  disparate 
pixels map to the output class beach; but, if similar or identical pixels are, at other times, labeled
plage  or  open  space  or  natural,  the  system  learns  to  associate  multiple  classes  with  a  given  input. 
Testbed  image  examples  have  shown  that  the  overall  pattern  of  distributed  predictions  can  reveal 
a  knowledge  hierarchy  which  guides  the  production  of  consistently  layered  maps  of  test  regions. 
Even  though  no  inter-class  relationships  are  specified  during  training,  the  system  uses  distributed 
activation  patterns  of  learned  codes  to  derive  knowledge  of  relationship  rules,  confidence 
estimates,  equivalence  classes,  and  hierarchical  structures  (Figure  1). 
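
The idea of reading rules off distributed predictions can be sketched in a few lines. The following is a minimal illustration, not the ARTMAP fusion system itself: it assumes each test pixel has already been assigned a set of predicted class labels, and it estimates the confidence of a rule "x implies y" as the fraction of pixels predicted as x that are also predicted as y. The function names, the 0.9 threshold, and the five example pixels are assumptions made for this sketch.

```python
from collections import Counter
from itertools import permutations


def derive_rules(predictions, min_confidence=0.9):
    """Estimate rules 'x implies y' from per-pixel sets of predicted labels.

    The confidence of a rule is the fraction of pixels predicted as x
    that are also predicted as y.
    """
    label_counts = Counter()
    pair_counts = Counter()
    for labels in predictions:
        label_counts.update(labels)
        # Ordered pairs (x, y) of distinct labels that co-occur on this pixel.
        pair_counts.update(permutations(sorted(labels), 2))
    rules = {}
    for (x, y), both in pair_counts.items():
        confidence = both / label_counts[x]
        if confidence >= min_confidence:
            rules[(x, y)] = confidence
    return rules


def equivalent_pairs(rules):
    """Label pairs that imply each other, which suggests an equivalence class."""
    return sorted((x, y) for (x, y) in rules if x < y and (y, x) in rules)


# Made-up distributed predictions for five pixels (illustrative only).
example_predictions = [
    {"beach", "plage", "open space", "natural"},
    {"beach", "plage", "open space", "natural"},
    {"open space", "natural"},
    {"river", "natural"},
    {"road"},
]

rules = derive_rules(example_predictions)
for (x, y), confidence in sorted(rules.items()):
    print(f"{x} => {y} (confidence {confidence:.2f})")
print("equivalent:", equivalent_pairs(rules))
```

In this toy data, beach and plage imply each other and so behave as one equivalence class, while beach implies open space and open space implies natural, but not the reverse. One-directional rules of this kind are what allow labels to be arranged in levels.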

Figure 1: (a) Boston testbed image, 5.4 km x 9 km. Input bands: Landsat (30 m),
panchromatic (15 m), and thermal (60 m). The image is divided into four vertical strips:
two for training, one for validation (if needed), and one for testing. This protocol
produces geographically distinct training and testing areas, to assess regional
generalization. Typically, class label distributions vary substantially across strips, and
ground truth is available for only a fraction of the training region. (b) For the Boston
example, the ARTMAP fusion system correctly produces all class rules and levels. Rule
confidence estimates appear beside the arrows.
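
The strip protocol described in the caption can be sketched as follows. This is a minimal illustration under stated assumptions: the image width of 360 columns (5.4 km at 15 m per pixel, if the shorter side is horizontal) and the left-to-right ordering of roles are chosen for the example and are not specified in the text.

```python
def vertical_strips(width, n_strips=4):
    """Split `width` pixel columns into contiguous (start, stop) column ranges."""
    edges = [(i * width) // n_strips for i in range(n_strips + 1)]
    return list(zip(edges[:-1], edges[1:]))


def role_of_column(col, strips, roles):
    """Return the role of the strip that contains pixel column `col`."""
    for (start, stop), role in zip(strips, roles):
        if start <= col < stop:
            return role
    raise ValueError(f"column {col} lies outside the image")


# Assumed values: 360 columns and this left-to-right ordering of roles.
strips = vertical_strips(360)
roles = ("train", "train", "validation", "test")
print(strips)
print(role_of_column(300, strips, roles))
```

Because every pixel in a strip shares one role, training and test pixels are never neighbors within a strip, which is what makes the evaluation a test of regional generalization rather than of local interpolation.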


CONFIGR (CONtour FIgure GRound) MODEL FOR LONG-RANGE OBJECT
COMPLETION  AND  FIGURE-GROUND  SEGMENTATION  (Chaitanya  Sai  Gaddam, 
Ennio  Mingolla,  Gail  Carpenter) 

In  two  ground-breaking  1985  articles  (Neural  dynamics  of  form  perception:  boundary 
completion, illusory figures, and neon color spreading; Neural dynamics of perceptual
grouping:  textures,  boundaries,  and  emergent  segmentations),  Stephen  Grossberg  and  Ennio 
Mingolla  introduced  the  Boundary  Contour  System  /  Feature  Contour  System  (BCS/FCS)  model. 
Over  the  past  two  decades,  the  original  model  has  been  extensively  developed  and 
experimentally  confirmed  in  the  domains  of  biological  vision  and  psychophysics,  and  aspects  of 
the  system  have  been  implemented  in  technological  applications.  However,  simulations  of 
BCS/FCS  properties  have  typically  been  limited  to  proof-of-concept  examples,  and  the  model’s 
potential  to  realize  the  computational  power  of  the  human  visual  system  has  barely  been  tapped. 

A new model called CONFIGR (CONtour FIgure GRound) exhibits many BCS/FCS capabilities
within  a  system  that  performs  well  in  a  general-purpose  image  processing  environment. 
CONFIGR efficiently and accurately carries out the visual functions of boundary finding,
long-range grouping, and filling-in of connected featural components. CONFIGR model analyses
have  also  fed  back  to  the  science,  introducing  new  hypotheses  concerning  key  neural 
mechanisms  of  early  vision. 

An  object  completion  testbed  demonstrates  CONFIGR  computations,  using  an  image  previously 
developed  for  the  analysis  of  information  fusion,  map  production,  and  target  recognition 
methodologies  (Parsons  &  Carpenter,  2003).  In  Figure  2,  a  default  ARTMAP  system  (Carpenter, 
2003)  provides  class  labels  for  individual  pixels,  based  on  local  image  data.  Objects  identified 
from  such  local  data  are  typically  incomplete.  In  particular,  roads  (red  pixels)  show  large  gaps, 
due,  for  example,  to  shadows  or  overhanging  trees.  This  example  demonstrates  how  the 
CONFIGR  model  fills  in  gaps  in  the  road  figure,  starting  with  incomplete  outputs  produced  by  a 
pixel-based  recognition  system.  CONFIGR  meets  the  challenge  of  calculating  correct  long-range 
completions  while  avoiding  spurious  interactions.  The  algorithm  also  meets  speed  requirements 
for  practical  implementation  with  large-scale  images. 

The  first  complete  CONFIGR  system  operates  on  square  binary  pixels,  which  implicitly  define  a 
computational  scale.  At  the  pixel  scale,  operations  of  filling-in  and  figure-ground  separation 
require  only  horizontal  and  vertical  orientations.  These  two  orientations  nonetheless  support 
long-range  object  completion  at  any  orientation,  since  the  completed  figure  may  include  the 
diagonal  of  an  arbitrary  rectangle  (Figure  2).  At  larger  scales,  where  the  smallest  independent 
units  are  pixel  clusters,  accurate  object  identification  and  figure-ground  separation  would  require 
long-range completion across multiple orientations. Similarly, the analog pixel values of
grey-scale, color, or multi-spectral images would require additional model development. Testbed
applications  also  include  digitizing  paper  maps  and  fault  line  identification  from  remotely  sensed 
images.  Figure  3  illustrates  pilot  study  results  for  these  sample  applications. 
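
To make the input and output of figure completion on a binary pixel grid concrete, the following toy sketch bridges short gaps between figure pixels along rows and columns only. It is a deliberately simple row/column gap-bridging routine, not the CONFIGR algorithm: it has no figure-ground competition, no mechanism for blocking spurious conjunctions, and no diagonal completion. The function name, the `max_gap` parameter, and the pixel codes (0 for ground, 1 for figure, 2 for completed) are assumptions made for illustration.

```python
def bridge_gaps(grid, max_gap):
    """Fill short runs of ground pixels that lie between two figure pixels.

    grid: list of rows holding 0 (ground) or 1 (figure).
    Returns a copy in which bridged pixels are marked 2 (completed).
    Only horizontal and vertical gaps of at most max_gap pixels are bridged.
    """
    rows, cols = len(grid), len(grid[0])
    filled = [row[:] for row in grid]

    def bridge_line(cells):
        # cells: (row, col) coordinates along one row or one column.
        last_figure = None
        for i, (r, c) in enumerate(cells):
            if grid[r][c] == 1:
                gap = 0 if last_figure is None else i - last_figure - 1
                if 0 < gap <= max_gap:
                    for gr, gc in cells[last_figure + 1:i]:
                        filled[gr][gc] = 2
                last_figure = i

    for r in range(rows):
        bridge_line([(r, c) for c in range(cols)])
    for c in range(cols):
        bridge_line([(r, c) for r in range(rows)])
    return filled


# A road fragment with a short gap (bridged) and a longer gap (left open).
road = [[1, 0, 0, 1, 0, 0, 0, 1]]
print(bridge_gaps(road, max_gap=2))
```

A fixed gap threshold of this kind bridges any two nearby figure pixels, whether or not they belong to the same object, which illustrates why a practical system also needs a way to block spurious completions.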

Figure 2: A default ARTMAP network maps image pixels to object classes. Figure pixels,
identified  from  local  sensor  information,  may  produce  gaps  in  structures  such  as  roads 
(red)  due,  for  example,  to  shadows  or  overhanging  trees.  The  CONFIGR  algorithm 
performs  long-range  figure  completion  (green).  Complementary  filling-in  of  ground 
pixels  (light  grey)  blocks  spurious  figure  conjunctions. 


Figure 3: CONFIGR applications include digitizing paper maps and identifying
geological structures from remotely sensed images. Pilot studies (illustrated here) have
demonstrated the feasibility of these technology transfer areas.


DISCOV (Dimensionless Shunting COlor Vision) MODEL: PHYSIOLOGY, IMAGE
PROCESSING,  AND  CLASSIFICATION  (Suhas  Chelian,  Ogi  Ogas,  Gail  Carpenter) 

The  DISCOV  (Dimensionless  Shunting  COlor  Vision)  system  models  a  cascade  of  primate  color 
vision  neurons:  retinal  ganglion,  thalamic  single  opponent,  and  two  classes  of  cortical  double 
opponents  (Figure  4)  (Chelian  &  Carpenter,  2005).  A  unified  model  formalism  derived  from 
psychophysical  axioms  produces  transparent  network  dynamics  and  principled  parameter 
settings.  DISCOV  fits  an  array  of  physiological  data  for  each  neuron  type,  and  makes  testable 
experimental  predictions  (Figure  5).  Pilot  studies  have  demonstrated  the  marginal  computational 
utility  of  each  model  neuron  on  recognition  tasks. 
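
For readers unfamiliar with shunting networks, the generic on-center off-surround shunting equation gives a sense of the kind of dynamics involved. The sketch below computes the steady state of dx/dt = -A x + (B - x) E - (x + D) I, where E is excitatory (center) input and I is inhibitory (surround) input. This is the textbook shunting form, not the published DISCOV equations; the parameter values and the red-center, green-surround wiring are assumptions made for illustration.

```python
def shunting_equilibrium(excitation, inhibition, decay=1.0, upper=1.0, lower=1.0):
    """Steady state of dx/dt = -decay*x + (upper - x)*excitation - (x + lower)*inhibition."""
    return (upper * excitation - lower * inhibition) / (decay + excitation + inhibition)


def single_opponent_response(red_center, green_surround, **params):
    """Toy red-ON center, green-OFF surround cell (assumed wiring, illustrative only)."""
    return shunting_equilibrium(red_center, green_surround, **params)


print(single_opponent_response(1.0, 0.0))  # red spot alone: positive response
print(single_opponent_response(0.0, 1.0))  # green surround alone: negative response
print(single_opponent_response(1.0, 1.0))  # balanced input: baseline
```

Two properties of this generic equation are worth noting: the response stays between -lower and upper however large the inputs become, and when the decay term is small relative to the inputs, the response approaches the contrast ratio (E - I)/(E + I), which does not change when both inputs are scaled by the same factor.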

Benchmark  testbeds  demonstrate  DISCOV  model  contributions  to  image  analysis.  In  particular, 
model  color  vision  neurons  respond  selectively  to  small  items  embedded  in  fields  of  various 
types,  as  Figure  4  illustrates  for  red-green  color  channels.  Image  examples  have  been  developed 
to  test  the  hypothesis  that  DISCOV  model  neurons  can  draw  attention  to  small  target  objects 
which  might  have  been  overlooked  by  more  traditional  image  processing  methods. 


[Figure panels: retinal; single opponent; double opponent I; double opponent II]

Figure 4: Preferred stimuli for color vision neurons in red ON-channels of the retina and
for red-green channels of the thalamus (single opponent) and cortical area V1 (double
opponent).

[Figure panels: (a) Physiology; (b) DISCOV. Each panel shows images and cell indices.]

Figure 5: Center response profiles of color vision neurons (red retinal and red-green
single  opponent  and  double  opponent  I,  II)  from  (a)  physiology  and  (b)  the  DISCOV 
model.  Elements  of  4x4  arrays  in  the  top  row  indicate  the  center  and  surround  images 
for  16  experiments  for  each  neuron  type.  White  images  mix  all  colors  maximally;  black 
images  have  no  color  components.  Response  bins  represent  strong  positive  (white), 
intermediate  positive,  baseline,  intermediate  negative,  and  strong  negative  (black) 
activations,  respectively.  Orange  center  squares  in  (a)  represent  outcomes  that  are 
unreported  in  the  experimental  literature,  and  thus  correspond  to  model  predictions  (b). 
Except  for  a  reversal  (in  image  cells  9  and  10)  of  reported  double  opponent  II 
intermediate  positive  and  negative  responses,  DISCOV  matches  all  physiological  data. 


METHODS  FOR  LARGE-SCALE  DATA  MINING:  PREDICTING  HIV  RESISTANCE 
TO  ANTIRETROVIRAL  THERAPY  (Timothy  McKenna,  Matthew  Woods,  Gail  Carpenter; 
Harvard  School  of  Public  Health:  Victor  De  Gruttola,  Alex  Macalalad;  Brown  University: 
Kenneth  Mayer;  Stanford  University:  Robert  Shafer) 

This  project  extends  an  ongoing  interdisciplinary  collaboration  between  CNS  and  medical 
faculty  at  the  Harvard  School  of  Public  Health.  The  project  has  developed  and  compared 
methods,  including  regression,  classification  trees,  and  neural  networks,  to  predict  both  clinical 
and  in  vitro  HIV  resistance  to  various  antiretroviral  therapies  (phenotype)  from  a  patient’s 
genotype.  CNS  contributions  include  comparative  analyses  of  neural  network  methods  and  the 
introduction  and  evaluation  of  data  representations.  The  latter  is  enabling  new  consideration  of 
the  question  of  how  the  nature  of  a  mutation  affects  drug  responses,  whereas  previous 
representations  were  restricted  to  the  location  alone. 
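
The difference between a location-only representation and one that also records the nature of a mutation can be illustrated with two simple encodings. This is a hypothetical sketch, not the representation developed in the project: it assumes mutations are written in the common wild-type/position/mutant notation (for example M184V), and the function names and the two-position example are invented for illustration.

```python
import re

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
MUTATION = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def parse_mutation(text):
    """Split a string such as 'M184V' into (wild type, position, mutant)."""
    match = MUTATION.match(text)
    if match is None:
        raise ValueError(f"unrecognized mutation: {text!r}")
    wild_type, position, mutant = match.groups()
    return wild_type, int(position), mutant


def encode_location_only(mutations, positions):
    """One bit per position: 1 if any mutation occurs there."""
    mutated = {parse_mutation(m)[1] for m in mutations}
    return [1 if p in mutated else 0 for p in positions]


def encode_location_and_residue(mutations, positions):
    """Twenty bits per position: 1 for the specific substituted amino acid."""
    present = {(pos, mutant) for _, pos, mutant in map(parse_mutation, mutations)}
    return [1 if (p, aa) in present else 0 for p in positions for aa in AMINO_ACIDS]


positions = [103, 184]  # illustrative subset of positions
print(encode_location_only(["M184V"], positions))
print(encode_location_only(["M184I"], positions))
print(encode_location_and_residue(["M184V"], positions) ==
      encode_location_and_residue(["M184I"], positions))
```

The location-only encoding maps both substitutions at position 184 to the same input vector, so no learning method could separate their effects. The second encoding keeps them distinct at the cost of a twentyfold larger input.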

NEW BENCHMARK PROBLEMS FOR IMAGE ANALYSIS

This  project  has  introduced  three  sets  of  image-based  benchmark  problems  to  enable  the 
development  of  new  systems  for  research  projects  funded  by  this  grant.  In  order  to  promote 
technology  transitions,  comparative  analyses,  and  further  system  development,  testbed  problems 
will  be  documented  and  posted  on  the  web  as  part  of  the  CNS  Image  Processing  Toolkit. 

(a)  Mapping  benchmarks 

In  collaboration  with  the  Boston  University  Center  for  Remote  Sensing:  Farouk  El-Baz 
(Director),  Sucharita  Gopal  (Professor),  Magaly  Koch  (Research  Associate  Professor) 
http://www.bu.edu/remotesensing 

CNS  PhD  students:  Chaitanya  Sai  Gaddam,  Arun  Ravindran 

Since  the  early  1990s,  faculty  from  the  Boston  University  Center  for  Remote  Sensing  and  CNS 
have  collaborated  on  projects  that  bring  neural  modeling  methods  to  imaging  problems.  The 
challenges  posed  by  these  problems  have  also  helped  constrain  and  direct  model  development  for 
technological  applications  in  other  domains.  This  project  has  developed  challenge  problems  for 
figure-ground  segmentation  and  rule  discovery.  Current  topics  of  exploratory  investigation 
include  hyperspectral  imagery  and  the  problem  of  automatically  digitizing  paper  maps. 

(b)  Biomedical  imaging  benchmarks 

In  collaboration  with  the  Boston  University  Center  for  Biomedical  Imaging:  Dae-Shik  Kim 
(Director) 

http://www.bumc.bu.edu/Dept/Home.aspx?DepartmentID=420
CNS  PhD  student:  Angela  Chapman 

A  new  collaboration  between  the  Boston  University  Center  for  Biomedical  Imaging  and  CNS  is 
a  rich  source  of  multi-modal  imagery  and  medical  expertise.  An  initial  study  of  image  fusion 
methods is developing benchmark problems from magnetic resonance imaging (MRI)
measurements,  which  produce  data  in  at  least  five  different  modalities,  each  reflecting  a  different 
aspect  of  brain  function  or  anatomy.  Image  fusion,  rule  discovery,  and  feature  selection  methods 
will  be  applied  and  further  developed  to  integrate  these  layers  of  information.  Since  individual 
MRI subjects will be scanned monthly, this benchmark also presents the opportunity to develop
methods  for  exploiting  the  temporal  evolution  of  multi-modal  data. 

(c)  Fenway  image  benchmark 

CNS PhD students: Ogi Ogas, Santiago Olivera. High school intern: Timothy St. Clair

The Massachusetts Geographic Information Systems website (http://www.mass.gov/mgis/)
provides  many  layers  of  state-wide  imagery  for  public  use.  This  site  is  the  source  of  orthophoto 
data,  at  0.5 m  resolution,  for  a  new  benchmark  image  called  Fenway,  selected  to  typify  an  urban 
setting  (Figure  6).  A  companion  registered  LANDSAT  image,  at  30m  resolution,  facilitates  the 
study  of  multi-scale  vision  and  image  processing  systems.  Ground  truth  identification,  problem 
development,  and  evaluation  are  facilitated  by  the  location  of  the  CNS  department  within  the 
designated  area.  This  image  has  been  developed  to  test  model  capabilities  for  directing  attention 
to  small  objects  such  as  cars  in  a  scene. 
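
The scale relationship between the two registered images can be worked through directly. At 0.5 m and 30 m resolution, each Landsat pixel spans 60 x 60 = 3600 orthophoto pixels. The sketch below maps an orthophoto pixel to the Landsat pixel that contains it. It assumes that the two grids share an upper-left origin and have aligned axes, which the text does not state; the function names are hypothetical.

```python
def scale_factor(fine_res_m=0.5, coarse_res_m=30.0):
    """Number of fine pixels along one side of a coarse pixel."""
    return round(coarse_res_m / fine_res_m)


def fine_to_coarse(row, col, fine_res_m=0.5, coarse_res_m=30.0):
    """Map a fine-grid pixel index to the coarse-grid pixel that contains it.

    Assumes both grids share an upper-left origin and have aligned axes.
    """
    k = scale_factor(fine_res_m, coarse_res_m)
    return row // k, col // k


k = scale_factor()
print(k, k * k)                 # pixels per side, pixels per coarse pixel
print(fine_to_coarse(59, 60))   # last fine row of coarse row 0, first column of coarse column 1
```

An object a few meters long, such as a car, covers on the order of tens of orthophoto pixels but only a small fraction of a single Landsat pixel, so small-object detection has to rely on the fine-scale image.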

Figure 6: The Fenway benchmark image (left) is a fragment of an orthophoto image of
the Boston area (right).


SOFTWARE  DEVELOPMENT:  CLASSIFIER  SIMULATION  MANAGER  (CLASSER) 
AND  THE  CNS  IMAGE  PROCESSING  TOOLKIT  (CNS  IPT)  (Siegfried  Martens,  Ogi 
Ogas,  Santiago  Olivera,  Timothy  St.  Clair,  Ennio  Mingolla,  Gail  Carpenter) 

Whereas  versions  of  ART  and  BCS  models  have  been  used  in  a  wide  variety  of  applications, 
many  more  systems  that  were  developed  primarily  in  the  scientific  context  have  been  applied 
only  to  proof-of-concept  examples.  Current  efforts  in  the  CNS  Vision  and  Technology  Labs  are 
seeking  to  bridge  the  gap  between  science  and  technology  through  analysis,  testing,  and 
development  of  system  variations  with  a  view  to  large-scale  applications.  This  project 
complements  these  enterprises  by  focusing  on  the  development  and  distribution  of  open-source 
software,  in  addition  to  collaborative  research  on  new  systems. 

CLASSER  (CLASSifier  Simulation  ManagER)  is  a  new  modular  set  of  software  tools  that 
provide  a  user  with  classifier  implementations  while  handling  details  of  data  management  and 
collection  of  test  results.  CLASSER  provides  a  high-level  system  interface  for  learning 
applications,  allowing  the  user  to  work  with  entire  data  sets  at  a  time  instead  of  individual  points, 
and  automating  the  collection  of  output  results.  The  software  facilitates  neural  algorithm 
implementations  in  both  the  user’s  application  setting  and  in  the  Leica  Geosystems  ERDAS 
IMAGINE  environment.  Downloads  of  Version  1.1  of  CLASSER  and  its  first  interface, 
CLASSER  Script,  are  now  available: 

http://profusion.bu.edu/techlab/modules/mydownloads/viewcat.php?op=&cid=49

The CNS Image Processing Toolkit (CNS IPT) is a modular set of Java tools that support the
development  of  computational  models  based  on  the  human  visual  system,  as  well  as  the  analysis 
and technology transitions of these models. Toolkit functions are complementary to and linked
with the CLASSER software. The CNS Image Processing Toolkit currently exists as a pilot
project (Figure 7). The Toolkit includes the testbeds developed by projects funded by this grant, as
well  as  a  set  of  elementary  test  images  to  be  used  as  consistent  benchmarks  for  the  development 
and  comparative  analysis  of  alternative  methods. 

Figure 7: Development page from the CNS Image Processing Toolkit.